Lecture 13

Introduction to Distance sampling with inlabru

Sara Martino

Dept. of Mathematical Sciences, NTNU

Janine Illian

University of Glasgow

Jafet Belmont

University of Glasgow

Distance Sampling

Overview of Distance Sampling

  • Distance sampling is a family of related methods for estimating the abundance and spatial distribution of wild populations.

  • Distance sampling is based on the idea that animals further away from observers are harder to detect than animals that are nearer.

  • This idea is implemented in the model as a detection function that depends on distance.

    • Animals at greater distances are harder to detect, and the detection function therefore declines as distance increases.

Density surface models

Coupling distance sampling data with spatial modelling allows maps of spatially varying density to be produced.

  • Traditionally, this is achieved in a two-stage approach: (i) detectability point estimates are used to create an offset vector, which is then (ii) used within a GLM or GAM for count response data.

  • This requires binning the data into counts based on some discretisation of space.

  • A major downside to this approach is that uncertainty from the detection model is not properly propagated to the second-stage spatial model.

  • The goal: one-stage distance sampling model, simultaneously estimating the detectability and the spatial distribution of animals using a thinned point process framework.

Thinned Point Process

Thinned Point Process

The LGCP is a flexible approach that can include spatial covariates to model the mean intensity and a mean-zero spatially structured random effect to account for unexplained heterogeneity not captured by the covariates.

  • To account for the imperfect detection of points we specify a thinning probability function \[ g(s) = \text{Prob}(\text{a point at s is detected}|\text{a point is at s}) \]

  • A key property of the LGCP is that a realisation of a point process with intensity \(\lambda(s)\), thinned by a probability function \(g(s)\), is also an LGCP with intensity:

\[ \underbrace{\tilde{\lambda}(s)}_{\text{observed process}} = \underbrace{\lambda(s)}_{\text{true process}} \times \underbrace{g(s)}_{\text{thinning probability}} \]
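To see the thinning mechanism concretely, here is a self-contained R sketch (not part of the lecture code; the intensity and detection scale are invented, and the transect locations x = 10, 30, 50 match the 1D example on the following slides):

```r
# Simulate a homogeneous Poisson process on [0, 60], then thin it with a
# detection probability that is highest near transects at x = 10, 30, 50.
set.seed(42)

lambda <- 2                      # intensity of the true process
n <- rpois(1, lambda * 60)       # number of true points on [0, 60]
s <- runif(n, 0, 60)             # point locations

# Half-normal thinning probability based on distance to the nearest transect
g <- function(s) {
  d <- pmin(abs(s - 10), abs(s - 30), abs(s - 50))
  exp(-0.5 * (d / 2)^2)
}

# Independent thinning: keep each point with probability g(s)
detected <- s[runif(n) < g(s)]
length(detected) <= n            # thinning never adds points
```

The detected points form a realisation of the thinned process with intensity \(\lambda \, g(s)\).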

Thinned Point Process

Let's visualise this in 1D:

  • Intensity function with points
  • Intensity (density) function with points and transect locations
  • Detection function \(\color{red}{g(s)}\); here \(\color{red}{g(s) = 1}\) on the transects (at x = 10, 30, and 50)
  • Detection function \(\color{red}{g(s)}\) and detected points
  • The detection function describes the probability \(\color{red}{p(s)}\) that a point is detected
  • Observations are from a thinned Poisson process with intensity \(\lambda(s) \color{red}{p(s)}\)

Detection Function

  • Standard distance sampling approaches specify \(g(s)\) as a function that declines with increasing distance
    • horizontal distance to the observer for point transects
    • perpendicular distance to the transect line for line transects

  • The thinning probability function is specified as a parametric family of functions.

Half-normal: \(g(\mathbf{s}|\sigma) = \exp(-0.5\,(d(\mathbf{s})/\sigma)^2)\)

Hazard-rate: \(g(\mathbf{s}|\sigma, b) = 1 - \exp(-(d(\mathbf{s})/\sigma)^{-b})\), where \(b > 0\) is a shape parameter
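These two families can be written directly in R (a sketch; the parameter values below are invented for illustration):

```r
# Half-normal: certain detection at distance 0, smooth decline with d
g_halfnormal <- function(d, sigma) exp(-0.5 * (d / sigma)^2)

# Hazard-rate: flatter "shoulder" near zero, controlled by shape b > 0
g_hazard <- function(d, sigma, b) 1 - exp(-(d / sigma)^(-b))

d <- seq(0.01, 8, length.out = 100)
# Both decline with distance; the hazard-rate keeps detection high longer
head(g_halfnormal(d, sigma = 2))
head(g_hazard(d, sigma = 2, b = 3))
```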

  • The thinned-LGCP likelihood is given by:

\[ \pi(\mathbf{s_1},\ldots,\mathbf{s_m}) = \exp\left( |\Omega| - \int_{\mathbf{s}\in\Omega}\lambda(s)g(s)\text{d}s \right) \prod_{i=1}^m \lambda(\mathbf{s}_i)g(\mathbf{s}_i) \]

  • To make \(g(s)\) and \(\lambda(s)\) identifiable, we assume the intensity is constant with respect to distance from the observer.

    • In practice this means we assume animals are uniformly distributed with respect to distance from the line


Putting all the pieces together

  • The true point pattern \(Y = \{\mathbf{s}_1,\ldots,\mathbf{s}_n\}\) is a realisation of a point process with intensity \(\lambda(s)\)
  • We design a sampling survey to collect the data along transects
  • Detected points are generated from the thinned PP with intensity \(\color{red}{\tilde{\lambda}(s)}= \lambda(s)\color{red}{g(d(s))}\)
    • The log intensity is \(\log \color{red}{\tilde{\lambda}(s)} = \overbrace{\log \lambda (s)}^{\mathbf{x}'\beta + \xi(s)} + \overbrace{\log \color{red}{g(d(s))}}^{-0.5~d(\mathbf{s})^2\sigma^{-2}}\)
    • The encounter rate, i.e. the number of observed animals within a distance \(W\), follows \(m \sim \text{Poisson} \left(\int_0^W \tilde{\lambda}(d)\,\text{d}d\right)\)
  • The pdf of the detected distances is \(\pi(d_1,\ldots,d_m|m) \propto \prod_{i=1}^m\dfrac{\tilde{\lambda}(d_i)}{\int_0^W \tilde{\lambda}(d)\,\text{d}d}\), which reduces to \(\prod_{i=1}^m\dfrac{g(d_i)}{\int_0^W g(d)\,\text{d}d}\) since \(\tilde{\lambda}(d_i) = \lambda\, g(d_i)\) when \(\lambda\) is constant in distance
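The encounter-rate integral can be evaluated numerically. A short sketch (all values invented for illustration) with a constant \(\lambda\) and a half-normal \(g\):

```r
lambda <- 5                               # constant intensity (illustrative)
sigma  <- 2                               # half-normal detection scale
W      <- 8                               # truncation distance

g <- function(d) exp(-0.5 * (d / sigma)^2)

# Poisson mean of the encounter rate: integral of the thinned intensity
E_m <- lambda * integrate(g, 0, W)$value
E_m

# The pdf of a detected distance is g(d) normalised over [0, W]
pdf_d <- function(d) g(d) / integrate(g, 0, W)$value
integrate(pdf_d, 0, W)$value              # integrates to 1
```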

An approximation: Strips as lines

  • If the strip width (\(2W\)) is narrow compared to the study region (\(\Omega\)), we can treat the strips as lines.

    • We need to adjust the intensity at a point \(\mathbf{s}\) on the line to account for the actual width of the strip.

    • We need to adjust the thinning probability to account for having collapsed all points onto the line.

An approximation: Strips as lines

The intensity at a point \(\mathbf{s}\) on the line becomes \(2W\lambda(s)\) instead of \(\lambda(s)\).

  • Let \(\pi(d)\) be the probability that the point is at a distance \(d\) from the line.

  • Let \(p(d)\) be the probability that is detected given it is at \(d\).

Then the thinning probability becomes \(\pi(d)\times p(d)\). Assuming the points are uniformly distributed within the strip, \(\pi(d) = 1/W\) (the density of distances is constant on the interval \([0,W]\)).

This updates our thinned intensity to

\[ \log \tilde{\lambda}(s) = \underbrace{\mathbf{x}'\beta + \xi(s)}_{\log \lambda(s)} + \log p(d) + \log\left(2W \times \tfrac{1}{W}\right) = \log \lambda(s) + \log p(d) + \log 2 \]

  • Typically \(p(d)\) is a non-linear function; this is where inlabru can help, via a fixed-point iteration scheme (further details available in this vignette)

Example

Example: Dolphins in the Gulf of Mexico

In the next example, we will explore data from a combination of several NOAA shipboard surveys conducted on pan-tropical spotted dolphins in the Gulf of Mexico.

  • A total of 47 observations of groups of dolphins were detected. The group size was recorded, as well as the Beaufort sea state at the time of the observation.

  • The transect width is 16 km, i.e. the maximal detection distance is 8 km (the transect half-width).

Step 1: Define the SPDE representation: The mesh

First, we need to create the mesh used to approximate the random field. We can either:

  1. Create a nonconvex extension of the points using the fm_mesh_2d and fm_nonconvex_hull functions from the fmesher package:

  • max.edge for maximum triangle edge lengths
  • cutoff to avoid overly small triangles in clustered areas

  2. Use a pre-defined sf boundary and specify it directly in the mesh construction via the fm_mesh_2d function

Step 1: Define the SPDE representation: The mesh

  • All random field models need to be discretised for practical calculations.

  • The SPDE models were developed to provide a consistent model definition across a range of discretisations.

  • We use finite element methods with local, piecewise linear basis functions defined on a triangulation of a region of space containing the domain of interest.

  • Deviation from stationarity is generated near the boundary of the region.

  • The choice of region and choice of triangulation affects the numerical accuracy.

Step 1: Define the SPDE representation: The mesh

  • Too fine a mesh \(\rightarrow\) heavy computation

  • Too coarse a mesh \(\rightarrow\) not accurate enough

Step 1: Define the SPDE representation: The mesh

Some guidelines

  • Create triangulation meshes with fm_mesh_2d():

  • edge length should be around a third to a tenth of the spatial range

  • Move undesired boundary effects away from the domain of interest by extending to a smooth external boundary:

  • Use a coarser resolution in the extension to reduce computational cost (max.edge=c(inner, outer)), i.e., add extra, larger triangles around the border

Step 1: Define the SPDE representation: The mesh

  • Use a fine resolution (subject to available computational resources) for the domain of interest (inner correlation range) and avoid small edges, i.e., filter out small input point clusters (0 \(<\) cutoff \(<\) inner)

  • Coastlines and similar can be added to the domain specification in fm_mesh_2d() through the boundary argument; simplify very detailed borders to keep the mesh size manageable.
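The guidelines above can be sketched with the fmesher package (the point locations and tuning values here are invented for illustration, not the dolphin data):

```r
library(fmesher)

set.seed(1)
pts <- matrix(runif(200, 0, 10), ncol = 2)  # stand-in observation locations

# Non-convex extension of the points, moving boundary effects
# away from the domain of interest
bnd <- fm_nonconvex_hull(pts, convex = -0.15)

mesh <- fm_mesh_2d(
  boundary = bnd,
  max.edge = c(1, 4),  # c(inner, outer): coarser triangles in the extension
  cutoff   = 0.2       # merge input points closer than this
)
mesh$n                 # number of mesh nodes
```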

Step 1: Define the SPDE representation: The SPDE

We use inla.spde2.pcmatern to define the SPDE model, assigning PC priors through the following probability statements:

  • \(P(\rho < 50) = 0.1\)

  • \(P(\sigma > 2) = 0.1\)
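The spde_model object referenced in the model components later could be constructed along these lines (a sketch; it requires the INLA package and a mesh object such as the one built above):

```r
# PC priors encoding P(range < 50) = 0.1 and P(sigma > 2) = 0.1
spde_model <- INLA::inla.spde2.pcmatern(
  mesh        = mesh,
  prior.range = c(50, 0.1),
  prior.sigma = c(2, 0.1)
)
```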

Step 2: Define the Detection function

We start by plotting the detected distances and a histogram of frequencies in distance intervals.

Then, we need to define a half-normal detection probability function. This must take distance as its first argument and the linear predictor of the sigma parameter as its second:
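The slide's code chunk is not reproduced here; a minimal definition consistent with the hn(distance, sigma) call used in the model code could be:

```r
# Half-normal detection function: probability 1 at distance 0,
# declining as distance grows relative to the scale parameter sigma
hn <- function(distance, sigma) {
  exp(-0.5 * (distance / sigma)^2)
}

hn(0, 2)            # 1: detection is certain on the transect line
hn(c(2, 4, 8), 2)   # declines with distance
```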

Example: Dolphins in the Gulf of Mexico

The LGCP Model

\[ \begin{aligned} p(\mathbf{y} | \lambda) & \propto \exp \left( -\int_\Omega \lambda(\mathbf{s}) p(\mathbf{s}) \mathrm{d}\mathbf{s} \right) \prod_{i=1}^n \lambda(\mathbf{s}_i) p(\mathbf{s}_i) \\ \eta(s) & = \color{#FF6B6B}{\boxed{\beta_0}} + \color{#FF6B6B}{\boxed{ \omega(s)}} + \color{#FF6B6B}{\boxed{ \log p(s)}} \\ \end{aligned} \]

The code

# define model component
cmp = ~ Intercept(1) + 
  space(main = geometry, model = spde_model) +
  sigma(1,
    prec.linear = 1,
    marginal = bru_mapper_marginal(qexp, pexp, dexp, rate = 1 / 8)
  ) 

# define model predictor
eta  = geometry + distance ~ space +
  log(hn(distance, sigma)) +
  Intercept + log(2)

# build the observation model
lik = bru_obs("cp",
              formula = eta,
              data = mexdolphin$points,
              ips = ips)

# fit the model
fit = bru(cmp, lik)

The integration scheme

# build integration scheme
distance_domain <-  fm_mesh_1d(seq(0, 8,
                              length.out = 30))
ips = fm_int(list(geometry = mesh,
                  distance = distance_domain),
             samplers = mexdolphin$samplers)


Results

Results: posterior summaries

We can use fit$summary.fixed and fit$summary.hyperpar to obtain posterior summaries of the model parameters.

                   mean   0.025quant   0.975quant
Intercept         −8.41        −9.47        −7.62
sigma             −0.05        −0.46         0.36
Range for space  131.74        41.79       320.28
Stdev for space    1.17         0.72         1.78

The spde.posterior function allows us to plot the posterior density of the Matérn field parameters.

Results: predicted densities

To map the spatial intensity we first need to define a grid of points where we want to predict.

  • We do this using the function fm_pixels(), which creates a regular grid of points covering the mesh
  • Then, we use the predict function which takes as input
    • the fitted model (fit)
    • the prediction points (pxl)
    • the model components we want to predict (e.g., \(e^{\beta_0 + \xi(s)}\))

Results: predicted densities

We can also use predict to evaluate the detection function:

distdf <- data.frame(distance = seq(0, 8, length.out = 100))
dfun <- predict(fit, distdf, ~ hn(distance, sigma))

Results: predicted expected counts

We can look at the posterior for the mean expected number of dolphins. Remember, the number of dolphins over the whole domain \(\Omega\) is \[ N(\Omega)\sim\text{Poisson}(E_{\Omega}), \text{ with } E_{\Omega} = \int_{\Omega}\lambda(s)ds \] so the expected number of dolphins is \[ E_{\Omega} = \int_{\Omega}\lambda(s)ds = \int_{\Omega}\exp\{\beta_0+\omega(s)\}ds \]

predpts <- fm_int(mexdolphin$mesh, mexdolphin$ppoly)
Lambda <- generate(fit, predpts, ~ sum(weight * exp(space + Intercept)),
                   n.samples = 3000)

Estimate:

Results: predicted expected counts

We can also obtain Monte Carlo samples for the expected number of dolphins as follows:

[1] 601  10